Direct replication

New results

| Network | Pre-shift mean | SD | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|---|---|
| UC 842 | 42.51 | 10.9 | 126.9 | 137.52 | 34.97 | 6.97 | 44.57 | <0.001 | 198 |
| UC 822 | 150.56 | 177.26 | 342.1 | 192.08 | 114.9 | 155.45 | 84.54 | <0.001 | 198 |
| C 842 | 49.75 | 13.08 | 65.68 | 8.83 | 53.52 | 13.94 | 54.3 | <0.001 | 198 |
| C 822 | 72.6 | 18.15 | 85.42 | 15.27 | 68.64 | 26.14 | 30.73 | <0.001 | 198 |
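
The F values in the table compare the reversal and the non-reversal condition. As a plausibility check, a minimal sketch of how F can be recovered from the reported means and standard deviations is given here. It assumes a balanced design with 100 runs per shift type (consistent with df = 198, but not stated explicitly in the report); `f_from_summary` is a hypothetical helper name.

```r
# One-way ANOVA F statistic for two equally sized groups,
# computed from group means and standard deviations only.
# Assumption: n = 100 runs per shift type (df = 198 implies 200 runs in total).
f_from_summary <- function(m1, s1, m2, s2, n = 100) {
  ms_between <- n * (m1 - m2)^2 / 2   # between-group mean square (1 df)
  ms_within  <- (s1^2 + s2^2) / 2     # pooled variance (2n - 2 df)
  ms_between / ms_within
}

# UC 842 row: reversal (126.9, 137.52) vs. non-reversal (34.97, 6.97)
f_uc842 <- f_from_summary(126.9, 137.52, 34.97, 6.97)
round(f_uc842, 2)
#> [1] 44.57
```

Because the inputs are rounded to two decimals, small deviations from the reported F values are to be expected for other rows.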

Results from Raijmakers

| Network | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|
| UC 842 | 183.13 | 29.42 | 78.13 | 15.74 | 990.44 | 0.0001 | 198 |
| UC 822 | 209.58 | 33.90 | 103.62 | 34.30 | 6.75 | 0.0101 | 198 |
| C 842 | 151.51 | 25.25 | 142.16 | 25.65 | 482.76 | 0.0001 | 198 |
| C 822 | 180.04 | 24.19 | 183.13 | 33.51 | 0.46 | 0.4556 | 198 |

Node activation and visualization of the network when retrained

Only Retraining the Last Layer

| Network | Pre-shift mean | SD | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|---|---|
| UC 842 | 50.24 | 49.23 | 40.38 | 24.14 | 390.75 | 142.68 | 586.2 | <0.001 | 198 |
| UC 822 | 167.34 | 178.72 | 150.31 | 190.53 | 485.21 | 56.55 | 283.94 | <0.001 | 198 |
| C 842 | 50.03 | 10.23 | 48.41 | 12.5 | 336.27 | 158.72 | 326.89 | <0.001 | 198 |
| C 822 | 74.08 | 22.7 | 66.22 | 10.24 | 463.17 | 105.97 | 1390.21 | <0.001 | 198 |

## # A tibble: 1,600 × 4
##    type   shift value retraining_type
##    <chr>  <chr> <dbl> <chr>          
##  1 UC_842 rev      33 last_layer     
##  2 UC_842 nrev    500 last_layer     
##  3 UC_842 rev      42 last_layer     
##  4 UC_842 nrev     51 last_layer     
##  5 UC_842 rev      35 last_layer     
##  6 UC_842 nrev    500 last_layer     
##  7 UC_842 rev      45 last_layer     
##  8 UC_842 nrev    214 last_layer     
##  9 UC_842 rev      31 last_layer     
## 10 UC_842 nrev    500 last_layer     
## # … with 1,590 more rows
##                  Df   Sum Sq Mean Sq F value   Pr(>F)    
## retraining_type   1  1812524 1812524   68.63 1.83e-15 ***
## Residuals       398 10510759   26409                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##                  Df   Sum Sq Mean Sq F value   Pr(>F)    
## retraining_type   1   796735  796735   17.48 3.58e-05 ***
## Residuals       398 18144122   45588                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##                  Df  Sum Sq Mean Sq F value Pr(>F)    
## retraining_type   1 1761991 1761991   104.9 <2e-16 ***
## Residuals       398 6687065   16802                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##                  Df  Sum Sq Mean Sq F value Pr(>F)    
## retraining_type   1 3521815 3521815   153.9 <2e-16 ***
## Residuals       398 9105344   22878                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
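
The four ANOVA tables above each have 398 residual degrees of freedom, which matches 400 runs per table, that is, one table per network type in the 1,600-row tibble. A minimal sketch of how such per-type ANOVAs could be produced is shown here. The data are random placeholders, not the study results, and the level name `all_layers` is an assumption, since only `last_layer` is visible in the printed rows.

```r
set.seed(1)

# Hypothetical stand-in for the long-format results: one row per simulated run.
# Column names follow the printed tibble; the values are random placeholders.
runs <- expand.grid(
  run             = 1:100,
  shift           = c("rev", "nrev"),
  retraining_type = c("last_layer", "all_layers"),  # "all_layers" is an assumed label
  type            = c("UC_842", "UC_822", "C_842", "C_822"),
  stringsAsFactors = FALSE
)
runs$value <- rpois(nrow(runs), lambda = 150)       # placeholder iteration counts

# One ANOVA per network type: 400 runs each, hence 398 residual df.
fits <- lapply(split(runs, runs$type), function(d) {
  aov(value ~ retraining_type, data = d)
})

summary(fits[["UC_842"]])
```

Splitting by `type` before fitting keeps each model a simple one-factor comparison; a single model with a `type * retraining_type` interaction would be an alternative design.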

Training the last layer

Possibility to get both?

After a non-reversal shift, if only the first-layer weight matrix is transferred, both stimulus dimensions are represented in the middle layer.

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## shift         1   3354    3354      54 5.14e-12 ***
## Residuals   198  12297      62                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

When the network is initialized with a perfect representation and the shifts are conducted on the last layer only, the reversal shift is slower (mean = 42.85, sd = 8.76) than the non-reversal shift (mean = 34.66, sd = 6.89).

Not resetting last layer

| Network | Pre-shift mean | SD | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|---|---|
| UC 842 | 43.63 | 13.09 | 48.7 | 55.12 | 303.58 | 159.97 | 226.93 | <0.001 | 198 |
| UC 822 | 151.72 | 178.86 | 113.08 | 152.25 | 431.58 | 134.01 | 246.59 | <0.001 | 198 |
| C 842 | 50.35 | 11.12 | 65.48 | 22.09 | 100.26 | 52.82 | 36.9 | <0.001 | 198 |
| C 822 | 72.31 | 22.66 | 113.37 | 48 | 186.68 | 96.17 | 46.52 | <0.001 | 198 |

Resetting last layer

| Network | Pre-shift mean | SD | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|---|---|
| UC 842 | 47.97 | 47.32 | 50.91 | 50.97 | 66.06 | 77.96 | 2.65 | 0.105 | 198 |
| UC 822 | 155.38 | 182.77 | 110.53 | 122.35 | 255.15 | 186.27 | 42.11 | <0.001 | 198 |
| C 842 | 53.57 | 11.4 | 58.43 | 27.33 | 45.57 | 26.18 | 11.55 | 0.001 | 198 |
| C 822 | 71.85 | 19.73 | 123.11 | 55.31 | 135.57 | 97.18 | 46.52 | <0.001 | 198 |

Shifts in which a reversal shift is used as the initiator to train recognition of the second stimulus dimension. The results differ depending on whether the last layer is re-initialized before the shift. Note: all layers are trained only in the first two steps; in steps 3.1 and 3.2, only the last layer is retrained. The final shifts (3.1 and 3.2) both start from, and are measured relative to, the pre-trained model from step 2.

In comparison, if the last layer is reinitialized:

This holds for both the constrained and the unconstrained version. For the simple stimuli, the dimensions are already separated.
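
The procedure described in this section (steps 1 and 2 on all layers, steps 3.1 and 3.2 on the last layer only, with or without re-initializing that layer) can be illustrated with a small sketch. Everything here is an assumption for illustration: the two-input encoding, the four hidden units, the learning rate, the stopping criterion, and the concrete targets chosen for each step. It is not the network used for the tables above.

```r
sigmoid <- function(z) 1 / (1 + exp(-z))

# Hypothetical two-layer network: inputs -> hidden (W1, b1) -> one output (W2, b2).
init_net <- function(n_in, n_hidden) {
  list(W1 = matrix(rnorm(n_in * n_hidden, sd = 0.5), n_in, n_hidden),
       b1 = rep(0, n_hidden),
       W2 = matrix(rnorm(n_hidden, sd = 0.5), n_hidden, 1),
       b2 = 0)
}

forward <- function(net, X) {
  H <- sigmoid(sweep(X %*% net$W1, 2, net$b1, "+"))
  list(H = H, out = sigmoid(H %*% net$W2 + net$b2))
}

# Replace the last layer with fresh random weights; keep the first layer.
reset_last <- function(net) {
  net$W2 <- matrix(rnorm(nrow(net$W2), sd = 0.5), nrow(net$W2), 1)
  net$b2 <- 0
  net
}

# Batch gradient descent on cross-entropy loss. Stops as soon as all stimuli
# are classified correctly and returns the number of weight updates needed.
train <- function(net, X, y, last_layer_only = FALSE, lr = 0.5, max_iter = 5000) {
  for (iter in seq_len(max_iter)) {
    f <- forward(net, X)
    if (all((f$out > 0.5) == (y == 1))) {
      return(list(net = net, iterations = iter - 1))
    }
    d_out <- f$out - y                                  # error at the output unit
    if (!last_layer_only) {                             # otherwise the first layer stays frozen
      d_hid  <- (d_out %*% t(net$W2)) * f$H * (1 - f$H)
      net$W1 <- net$W1 - lr * t(X) %*% d_hid / nrow(X)
      net$b1 <- net$b1 - lr * colMeans(d_hid)
    }
    net$W2 <- net$W2 - lr * t(f$H) %*% d_out / nrow(X)
    net$b2 <- net$b2 - lr * mean(d_out)
  }
  list(net = net, iterations = max_iter)
}

set.seed(42)
X <- rbind(c(1, 1), c(1, 0), c(0, 1), c(0, 0))          # columns: is_circle, is_big

step1 <- train(init_net(2, 4), X, y = X[, 1])           # step 1: all layers
step2 <- train(step1$net, X, y = 1 - X[, 1])            # step 2: all layers, reversal

# Steps 3.1 and 3.2 both start from the step 2 model and touch the last layer only.
keep  <- step2$net
fresh <- reset_last(step2$net)
iterations <- c(
  rev_keep   = train(keep,  X, y = X[, 1], last_layer_only = TRUE)$iterations,
  nrev_keep  = train(keep,  X, y = X[, 2], last_layer_only = TRUE)$iterations,
  rev_reset  = train(fresh, X, y = X[, 1], last_layer_only = TRUE)$iterations,
  nrev_reset = train(fresh, X, y = X[, 2], last_layer_only = TRUE)$iterations
)
iterations
```

A single seed gives only one draw per condition; the tables in this report summarize many runs per condition, so the sketch would need to be wrapped in a loop over seeds to produce comparable means and standard deviations.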

Complex stimuli

  1. Train a one-layer network on one stimulus dimension (e.g. circle is correct).

  2. Retrain the same layer on the second stimulus dimension (e.g. big is correct).

  3. Train a new, tiny layer on top of the first layer on the first stimulus dimension.

  4. Retrain the new layer on the second stimulus dimension.

The dimensions are now well represented in the deeper layer of the network in the sense that different neurons tend to respond to different dimensions.

Both reversal and non-reversal shifts are now very fast when only the last layer is retrained.

Iteration frequencies when all layers are retrained.

## [1] 100
##              Df   Sum Sq Mean Sq F value   Pr(>F)    
## shift         1  7508200 7508200   17.34 4.65e-05 ***
## Residuals   198 85709637  432877                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

| Pre-shift mean | SD | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|---|
| 1240.91 | 371.1 | 1164.58 | 756.38 | 777.07 | 541.89 | 17.34 | <0.001 | 198 |

Iteration frequencies when only last layer is retrained

## [1] 100
##              Df    Sum Sq   Mean Sq F value Pr(>F)    
## shift         1 157098448 157098448    1380 <2e-16 ***
## Residuals   198  22538753    113832                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

| Pre-shift mean | SD | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|---|
| 1287.28 | 330.04 | 227.44 | 477.14 | 2000 | 0 | 1380.09 | <0.001 | 198 |

Frequencies of learning iterations when pretrained over several cycles, as shown above. Reversal shift: small-big; non-reversal shift: small-triangle.

##              Df   Sum Sq  Mean Sq F value   Pr(>F)    
## shift         1 20278259 20278259   57.58 1.24e-12 ***
## Residuals   198 69735723   352201                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

| Pre-shift mean | SD | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|---|
| 59.22 | 32.07 | 70.55 | 147.81 | 707.39 | 826.17 | 57.58 | <0.001 | 198 |
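
The shift labels used in this section (reversal: small-big, non-reversal: small-triangle) can be made concrete by writing out the correct response for each of the four stimuli. The following is a minimal sketch; the 0/1 target coding and the column names are assumptions for illustration, not the encoding used in the simulations.

```r
# All four combinations of the two stimulus dimensions.
stimuli <- expand.grid(shape = c("circle", "triangle"),
                       size  = c("small", "big"),
                       stringsAsFactors = FALSE)

# Target = 1 when the stimulus belongs to the currently correct category.
stimuli$pre_shift   <- as.integer(stimuli$size  == "small")     # "small" is correct
stimuli$reversal    <- as.integer(stimuli$size  == "big")       # same dimension, opposite value
stimuli$nonreversal <- as.integer(stimuli$shape == "triangle")  # other dimension becomes relevant

stimuli
```

Written this way, a reversal shift changes the target for all four stimuli, whereas a non-reversal shift changes it for only two of them.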

When pretrained over several cycles (biased pretraining) and inverted stimuli

(Reversal shift: triangle-circle, nonreversal shift: triangle-big)

##              Df   Sum Sq  Mean Sq F value Pr(>F)    
## shift         1 39322486 39322486   111.1 <2e-16 ***
## Residuals   198 70104708   354064                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

| Pre-shift mean | SD | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|---|
| 1002.1 | 689.39 | 971.09 | 731.33 | 1857.91 | 416.28 | 111.06 | <0.001 | 198 |

Digital encoding, initial stimuli (reversal shift: big-small, non-reversal shift: big-triangle)

## Warning in if (is.na(n_hidden)) {: the condition has length > 1 and only the
## first element will be used

##              Df    Sum Sq   Mean Sq F value Pr(>F)    
## shift         1 148248702 148248702   145.7 <2e-16 ***
## Residuals   198 201512315   1017739                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##              Df  Sum Sq Mean Sq F value Pr(>F)    
## shift         1 5384121 5384121   419.3 <2e-16 ***
## Residuals   198 2542711   12842                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

| Retrained layers | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|
| All layers | 2580.2 | 1007.2 | 858.29 | 1010.46 | 145.66 | <0.001 | 198 |
| Last layer | 195.57 | 43.58 | 523.72 | 154.22 | 419.26 | <0.001 | 198 |
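
The warning `the condition has length > 1` emitted in this section indicates that `n_hidden` was passed as a vector to a scalar `if (is.na(n_hidden))` test, so only its first element was checked. From R 4.2.0 onward this situation raises an error instead of a warning. The function that triggers it is not shown in this report, so the following is only a sketch of one possible guard; `resolve_hidden` and its `default` argument are hypothetical names.

```r
# Hypothetical helper: the function that actually triggers the warning is not shown here.
resolve_hidden <- function(n_hidden, default = 4L) {
  # is.na() returns one logical per element, but `if` needs exactly one value,
  # so the vector result is collapsed explicitly with all() / anyNA().
  if (length(n_hidden) == 0L || all(is.na(n_hidden))) {
    return(default)                       # nothing specified: fall back to the default
  }
  if (anyNA(n_hidden)) {
    stop("n_hidden mixes NA and layer sizes")
  }
  as.integer(n_hidden)
}

resolve_hidden(NA)        # falls back to the default
resolve_hidden(c(8, 4))   # one size per hidden layer
```

Collapsing the check with `all()` makes the intended meaning explicit (no hidden layer sizes given at all) instead of silently testing only the first element.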

Digital encoding, inverted stimuli (reversal shift: circle-triangle, non-reversal shift: circle-big)

##              Df    Sum Sq   Mean Sq F value Pr(>F)    
## shift         1 666778510 666778510   10373 <2e-16 ***
## Residuals   198  12728014     64283                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

##              Df   Sum Sq  Mean Sq F value Pr(>F)    
## shift         1 14288789 14288789   187.7 <2e-16 ***
## Residuals   198 15074684    76135                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

| Retrained layers | Reversal mean | SD | Non-reversal mean | SD | F value | p | df |
|---|---|---|---|---|---|---|---|
| All layers | 3869.26 | 320.48 | 217.47 | 160.8 | 10372.56 | <0.001 | 198 |
| Last layer | 891.37 | 351.75 | 356.79 | 168.94 | 187.68 | <0.001 | 198 |